(54) Image processing apparatus and method

(57) Position, size and attribute regarding one or a 
plurality of areas in an image are held as template infor- 
mation. Block areas such as text areas and figure areas 
are extracted from a document image that has entered 
from a scanner and an attribute is added to each block 
area. A block area that at least partially overlaps an 
area indicated by the template information and whose
attribute agrees with the attribute included in the template
information is acquired as extracted information
from the block areas that have been extracted. This 
makes it possible to reliably extract a desired area from 
an entered document image while employing a tem- 
plate. 
Description

BACKGROUND OF THE INVENTION 

[0001] This invention is applicable to electronic equip- 
ment such as an OCR (optical character reader), copier, 
facsimile machine or processor for implementing an 
electronic database and, more particularly, relates to an 
image processing apparatus and method for extracting 
a specific desired area from a document image. 
[0002] Two methods of extracting a desired area from 
a document are available. The first method is such that 
whenever the operator wishes to extract a desired area, 
the operator designates this area in an input image 
each time. This method involves reading the document 
image using a scanner, displaying the scanned image 
on a display monitor and having the operator designate 
the desired area using a mouse or the like. 
[0003] The second method involves creating a template
for which size and position information representing
rectangular areas has been decided in advance,
applying the rectangular areas decided by the template 
directly to an input image and then extracting these 
areas from the input image. In this case rectangular 
areas whose positions and sizes have been decided by 
the template are extracted from a scanned document 
image and the operator need no longer perform the 
laborious task of specifying extraction areas one after 
another. 

[0004] The first method is disadvantageous in that the 
operator must specify the desired area each time. This 
method, therefore, is not suited to the processing of a 
large number of documents. The second method using 
the template is disadvantageous in that if there is a disparity
in position or size between an area to be extracted
from the input image and the rectangular area decided 
by the template, the area to be extracted may be omitted 
in the extraction process. 

SUMMARY OF THE INVENTION 

[0005] The present invention has been devised in view 
of the foregoing problems and a concern is to provide 
an image processing apparatus and method whereby it 
is possible to extract a desired area from a document 
image in reliable fashion. 

[0006] Another concern of the present invention is to 
make possible the rapid and reliable extraction of a 
desired area from a large quantity of document images. 
[0007] A further concern of the present invention is to
provide an image processing apparatus and method 
whereby it is possible to reliably extract a desired area 
from an entered document image while employing a 
template. 

[0008] An image processing apparatus according to 
one aspect of the present invention
comprises: holding means for holding position, size and
attribute as template information in regard to one or a
plurality of areas in an image; image input means for
inputting a document image; first extraction means for
extracting block areas from the document image input
by the image input means and evaluating attributes of
the extracted block areas; and second extraction means
for extracting, from block areas that have been extracted
by the first extraction means, a block area that at least
partially overlaps an area indicated by the template
information and whose attribute agrees with the
attribute included in the template information.

[0009] An image processing method according to 
another aspect of the present invention 
comprises: a holding step of holding position, size and 
attribute as template information in regard to one or a 
plurality of areas in an image; an image input step of
inputting a document image; a first extraction step of
extracting block areas from the document image input at
the image input step and evaluating attributes of the
extracted block areas; and a second extraction step of
extracting, from block areas that have been extracted at
said first extraction step, a block area that at least par- 
tially overlaps an area indicated by the template infor- 
mation and whose attribute agrees with the attribute 
included in the template information. 
[0010] Other features and advantages of the present
invention will be apparent from the following description 
taken in conjunction with the accompanying drawings, 
in which like reference characters designate the same 
or similar parts throughout the figures thereof. 


BRIEF DESCRIPTION OF THE DRAWINGS 

[0011] The accompanying drawings, which are incorporated
in and constitute a part of the specification,
illustrate embodiments of the invention and, together
with the description, serve to explain the principles of 
the invention. 

Fig. 1 is a block diagram illustrating the construction 
of an image processing apparatus according to a
first embodiment of the present invention; 
Fig. 2 is a flowchart for describing the procedure of 
template save processing according to the first 
embodiment; 

Fig. 3 is a flowchart for describing the procedure of
area extraction processing according to the first 
embodiment; 

Fig. 4 is a diagram showing an example of a document
(document A) read in for the purpose of generating
template data;

Fig. 5 is a diagram showing an example of the dis- 
play of a screen for setting areas in template save 
processing according to the first embodiment; 
Figs. 6A, 6B are diagrams useful in describing the 
data structure of template data generated by designation
of areas and setting of attributes;
Fig. 7 is a diagram for describing the registered 
state of template data; 
Fig. 8 is a diagram illustrating the manner in which
a "Document A Template" is registered in the tem- 
plate data; 

Fig. 9 is a diagram showing a document B, which is 
an example of a document to be processed; 
Fig. 10 is a diagram showing an example of results 
obtained by executing area partitioning processing 
in regard to the document B of Fig. 9; 
Fig. 11 is a diagram useful in describing the manner
in which a template and the blocks of a document 
are compared; 

Fig. 12 is a flowchart illustrating the procedure of 
template save processing according to a second 
embodiment of the present invention; 
Fig. 13 is a diagram showing an example of results 
of area partitioning processing in this case; 
Fig. 14 is a flowchart illustrating the procedure of 
template save processing according to a third 
embodiment of the present invention; 
Fig. 15 is a diagram showing a state in which a 
block 3 and a block 5 have been selected in tem- 
plate save processing according to the third 
embodiment; 

Fig. 16 is a diagram showing the data structure of 
the document A template in the third embodiment; 
Fig. 17 is a diagram useful in describing results of
comparing blocks in a case where a desired area is 
extracted from the document B using the document 
A template of Fig. 16; 

Fig. 18 is a flowchart illustrating the procedure of 
template save processing according to a fourth 
embodiment of the present invention; and 
Fig. 19 is a flowchart for describing the procedure of
area extraction processing according to a fifth 
embodiment of the present invention. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0012] Preferred embodiments of the present inven- 
tion will now be described in detail in accordance with 
the accompanying drawings. 

[First Embodiment] 

[0013] Fig. 1 is a block diagram illustrating the con- 
struction of an image processing apparatus according 
to a first embodiment of the present invention. The 
apparatus includes a scanner 101 for irradiating a docu- 
ment having an image, reading the reflected light and 
converting the reflected light to an electric signal; a 
scanner interface 102 for converting the electric signal 
obtained by the scanner 101 to a binary digital signal 
and sending this signal to other components; a pointing 
device 103 (since a mouse is used in this embodiment, 
the device will be referred to as a mouse below) for 
entering desired coordinates on the window of a dis- 
play; an interface circuit 104 for receiving a signal from 
the mouse 103 and transmitting the signal to other components;
a CPU 105 for executing overall control of the
apparatus and processing such as area partitioning; a
ROM 106 storing programs, which are executed by the
CPU 105, for various control operations and for various
processing, as well as font data; a RAM 107 used as a
working area for expanding a document image and for
area partitioning processing; a display 108 for displaying
input images and the like, wherein an image displayed
is stored in a VRAM area created by a
prescribed address area in the RAM 107; an external
storage device 110 such as a hard disk in which data
and the like is stored; and an interface 111 for the external
storage device 110. These components are interconnected
by a bus 112.

[0014] The flow of processing according to the first 
embodiment will now be described in accordance with 
the flowcharts of Figs. 2 and 3. Fig. 2 is a flowchart useful
in describing the procedure of processing for saving
a template according to the first embodiment, and Fig. 3
is a flowchart useful in describing the procedure of 
processing for extracting an area according to the first 
embodiment. 

[0015] Processing for saving a template used in area
extraction will be described first with reference to Fig. 2.
A document A of the kind shown in Fig. 4 having a format
desired to be saved is read and converted to binary
image data by the scanner 101 at step S201. Next, at step
S202, small areas (referred to simply as "areas" or
"blocks" below) having attributes such as "text", "table"
and "figure" are set on the input image obtained.

[0016] According to this embodiment, the document A
of Fig. 4 is read by the scanner and the read image is
displayed on the display 108. Fig. 5 is a diagram showing
the display of a screen for setting areas in template
save processing according to the first embodiment. An
attribute menu 51 is displayed together with the image
of document A on the display 108. The operator uses
the mouse 103 to select a desired attribute from the
attribute menu 51 and to designate frames indicative of
rectangular areas. By thus causing the frames of
selected attributes to be displayed at desired positions,
attributes are set in regard to each rectangular area.
When a desired attribute is selected from the attribute
menu 51, the color of the border of the rectangular
frame displayed by operating the mouse 103 is set to a
color that is associated with the selected attribute. In
Fig. 5, the border color of a rectangular frame 501 is
black, which indicates that the attribute of this area is
"text". The border color of a rectangular frame 502 is
red, which indicates that the attribute of this area is
"table". The border color of a rectangular frame 504 is
yellow, which indicates that the attribute of this area is
"figure".

[0017] Figs. 6A, 6B are diagrams useful in describing
the data structure of template data generated by designation
of areas and setting of attributes. Area data is
stored on an area-by-area (block-by-block) basis in the
manner shown in Fig. 6A. The area data is obtained by
registering starting-point coordinates X, Y, width and 
height of the area (namely position information indica- 
tive of the rectangular frame displayed by operating the 
mouse), and includes an "attribute" field in which an 
identification number corresponding to the attribute set 
for the above-mentioned rectangular frame is set. Each 
attribute and its identification number are as shown in 
Fig. 6B. It should be noted that the X, Y coordinates of 
the starting point represent the upper left corner of the 
particular area. 
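The area data record described above can be sketched as a small data structure. This is only an illustrative sketch: the names `AreaData` and `Attribute` are hypothetical, and the identification numbers are assumed placeholders, since the actual values are those given in Fig. 6B.

```python
from dataclasses import dataclass
from enum import IntEnum


class Attribute(IntEnum):
    # Assumed identification numbers; the real ones are listed in Fig. 6B.
    TEXT = 1
    TABLE = 2
    FIGURE = 3
    TITLE = 4


@dataclass(frozen=True)
class AreaData:
    """One block: upper-left starting point, width, height and attribute."""
    x: int          # X coordinate of the upper left corner
    y: int          # Y coordinate of the upper left corner
    width: int
    height: int
    attribute: Attribute

    @property
    def right(self) -> int:
        return self.x + self.width

    @property
    def bottom(self) -> int:
        return self.y + self.height


# Example record with made-up coordinates.
block = AreaData(x=40, y=60, width=300, height=120, attribute=Attribute.TEXT)
```

Storing the starting point together with width and height, rather than two corner points, mirrors the fields named in the description.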

[0018] Next, at step S203, the assemblage of area 
data of each of the blocks set in the manner described 
above is registered and saved as a template. For exam- 
ple, Fig. 7 shows an example in which a template is reg- 
istered anew under the title "Document A Template" as 
the fourth of a group of already existing templates. 
[0019] Fig. 8 is a diagram illustrating the manner in 
which a "Document A Template" is registered in the 
template data. Since the areas indicated by the rectan- 
gular frames 501 - 505 have been set on the image of 
document A in the manner shown in Fig. 5, five blocks
are registered in the document A template. Which rectangular
frame corresponds to which block of the blocks
1 - 5 may be decided, by way of example, in accordance 
with the order in which the rectangular frames were des- 
ignated at step S202. 
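Registration of the template can be illustrated roughly as follows. The function name `register_template`, the placeholder template names, and all coordinates and attributes in the sample frames are assumptions made for illustration; only the numbering of blocks in designation order and the position of the new template as the fourth entry come from the description.

```python
def register_template(registry, name, frames):
    """Save the designated frames as blocks 1..n, numbered in designation order."""
    registry[name] = {
        number: dict(frame) for number, frame in enumerate(frames, start=1)
    }
    return registry[name]


# Three placeholder entries stand in for the already existing templates.
registry = {"Template 1": {}, "Template 2": {}, "Template 3": {}}

# Five frames in the order they were designated (values are made up).
frames = [
    {"x": 40, "y": 30, "width": 500, "height": 60, "attribute": "text"},
    {"x": 40, "y": 110, "width": 240, "height": 200, "attribute": "table"},
    {"x": 300, "y": 110, "width": 240, "height": 200, "attribute": "text"},
    {"x": 40, "y": 330, "width": 240, "height": 180, "attribute": "figure"},
    {"x": 300, "y": 330, "width": 240, "height": 180, "attribute": "text"},
]

saved = register_template(registry, "Document A Template", frames)
position = list(registry).index("Document A Template") + 1  # 1-based position
```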

[0020] Processing for extracting desired areas from a 
newly read image using a template registered in the 
manner described above will now be explained with ref- 
erence to the flowchart of Fig. 3. 
[0021] First, a template necessary to extract desired 
areas is selected at step S301. In this embodiment, the 
template names of templates registered as shown in 
Fig. 7 are displayed on the display 108 and the operator
is allowed to select the desired template using the 
mouse 103. Next, at step S302, a document to undergo 
processing is read by the scanner 101 and converted to 
binary image data in order that the area partitioning 
processing, described later, may be executed. This 
example will be described on the assumption that the 
document A template set as shown in Fig. 5 has been
selected and that a document B, shown in Fig. 9, has 
been read in as the document to be processed. 
[0022] This is followed by step S303, at which the 
input image obtained is subjected to area partitioning 
processing known to those skilled in the art, blocks are 
extracted and the attribute of each block is evaluated. 
Fig. 10 is a diagram showing an example of results 
obtained by executing area partitioning processing in 
regard to the document B. Each block of the blocks A - 
E is stored as extracted area information, with the data 
structure of the stored blocks being the same as that of 
the area information shown in Figs. 6A, 6B. In other
words, information representing the position, size and 
attribute of each extracted block is stored. 
[0023] Next, at step S304, area data of blocks that 
have been extracted from document B are compared
with the area data of blocks that have been saved in a
selected template (the document A template). This is
followed at step S305 by the extraction of a block whose
area at least partially overlaps the area of a block in the
template and has the same attribute as that of the area
it overlaps. If such a block is extracted at step S305, 
then this block is deemed to be a block identical with the 
desired block and the image contained in the area of 
this block is output at step S306. 
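The comparison of steps S304 and S305 can be sketched as a rectangle intersection test combined with an attribute check. This is a minimal sketch under stated assumptions: the names `Block`, `overlaps` and `extract_matching` are hypothetical, attributes are written as strings for readability, and rectangles that merely touch along an edge are assumed not to overlap.

```python
from typing import NamedTuple


class Block(NamedTuple):
    x: int          # upper left corner
    y: int
    width: int
    height: int
    attribute: str


def overlaps(a: Block, b: Block) -> bool:
    """True if the two rectangles share at least some interior area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def extract_matching(template, blocks):
    """Keep document blocks that overlap a template block of the same attribute."""
    return [
        blk for blk in blocks
        if any(overlaps(blk, t) and blk.attribute == t.attribute for t in template)
    ]


# Made-up coordinates for illustration.
template = [Block(10, 10, 100, 50, "text"), Block(10, 80, 100, 60, "table")]
document = [
    Block(15, 5, 100, 50, "text"),      # displaced, but overlaps a text area
    Block(12, 85, 90, 50, "figure"),    # overlaps the table area, wrong attribute
    Block(300, 300, 50, 50, "text"),    # no overlap at all
]
extracted = extract_matching(template, document)
```

In this example only the first document block is extracted: the second overlaps a template area but has a different attribute, and the third overlaps nothing.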

[0024] Fig. 11 is a diagram useful in describing the
manner in which a template and the blocks of a document
are compared. Blocks that have been recorded as
the document A template are indicated by dashed lines
in Fig. 11, and blocks that have been extracted from the
image of document B are indicated by broken lines.
Though blocks A and B that have been extracted from
document B are somewhat displaced from the block
positions of the document A template, the blocks A and
B have portions that overlap the block positions of the
template. The attribute of both of these areas is "text".
Blocks A and B, therefore, are extracted at step S305.
The end result is that the area data of all blocks A - E of
document B obtained in Fig. 10 is output.
[0025] Thus, in accordance with the first embodiment,
as described above, even if the position and size of an
area set in a template differ slightly from the position 
and size of an area to be extracted from a document 
image that has actually been read, a desired area can 
be extracted from the document image reliably. 

[0026] At step S305 described above, a block is
extracted if it at least partially overlaps an area set in the
template and has the same attribute as that of this area.
However, whether or not a block is to be extracted may
be decided upon taking into account the degree of overlap
between the two blocks. For example, it may be so
arranged that a block to be selected must overlap a
block in the template by 70% or more and must have the
same attribute. Furthermore, an arrangement may be
adopted in which this ratio can be set for each block of
the template.
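The degree-of-overlap variation can be sketched as below. The description does not state how the percentage is measured, so this sketch assumes the ratio is the intersection area divided by the area of the template block; the names `TemplateBlock`, `overlap_ratio` and `extract_with_ratio` are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    x: int
    y: int
    width: int
    height: int
    attribute: str


class TemplateBlock(NamedTuple):
    area: Block
    min_ratio: float = 0.7   # per-block threshold; 70% is the example value


def overlap_ratio(block: Block, area: Block) -> float:
    """Intersection area as a fraction of the template area (assumed definition)."""
    w = min(block.x + block.width, area.x + area.width) - max(block.x, area.x)
    h = min(block.y + block.height, area.y + area.height) - max(block.y, area.y)
    if w <= 0 or h <= 0:
        return 0.0
    return (w * h) / (area.width * area.height)


def extract_with_ratio(template, blocks):
    """Keep blocks whose attribute matches and whose overlap meets the threshold."""
    return [
        b for b in blocks
        if any(b.attribute == t.area.attribute
               and overlap_ratio(b, t.area) >= t.min_ratio
               for t in template)
    ]


area = Block(0, 0, 100, 100, "text")
shifted = Block(20, 0, 100, 100, "text")   # covers 80% of the template area
far = Block(50, 50, 100, 100, "text")      # covers 25% of the template area

strict = extract_with_ratio([TemplateBlock(area)], [shifted, far])
loose = extract_with_ratio([TemplateBlock(area, min_ratio=0.2)], [shifted, far])
```

With the default threshold only the slightly shifted block is kept, while lowering the threshold for that template block admits both.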

[Second Embodiment] 

[0027] According to the first embodiment described
above, the setting of areas saved in a template is performed
manually using the mouse 103. However, area
setting processing for saving a template can be auto- 
mated using area partitioning processing of the kind 
executed at step S303. 
[0028] The flow of such processing will be described
in accordance with the flowchart of Fig. 12. Fig. 12 is a 
flowchart illustrating the procedure of template save 
processing according to the second embodiment of the 
present invention. 
[0029] The document A (see Fig. 4) having the format
desired to be saved is read in and converted to binary
image data, which is for area partitioning processing
described below, by the scanner at step S1201. The
input image obtained is then subjected to area partitioning
processing at step S1202 to extract various areas
(blocks) such as a text area, figure area, table area and
title area. It should be noted that the area partitioning
processing used at step S1202 may employ a technique
well known to those skilled in the art. The result of this
area partitioning processing is shown in Fig. 13. Each 
block is thus extracted and identification numbers corre- 
sponding to the various attributes as well as the position 
information are obtained as area partitioning data on a 
per-block basis. The structure of the area data regard- 
ing this document in this case can be assumed to be the 
same as that shown in Fig. 6. The area data of each of 
the extracted blocks is then registered and saved as the 
"Document A Template" in the manner illustrated in Fig. 
8. 

[0030] It should be noted that the processing for
extracting areas from the image of an input document 
using the template registered as set forth above is as 
described earlier in conjunction with the flowchart of 
Fig. 3. 

[0031] Thus, in accordance with the arrangement 
described above, a document to serve as a template 
need only be read by the scanner 101 to generate the 
template automatically. This enhances operability. 
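The automatic template generation of the second embodiment might be sketched as follows. The area partitioning technique itself is not specified here, so it is passed in as a callable; `save_template_automatically` and `stub_partition` are hypothetical names, and the stub merely returns made-up blocks in place of a real partitioning routine.

```python
def save_template_automatically(registry, name, image, partition_areas):
    """Partition the scanned image and save every extracted block as the template."""
    blocks = list(partition_areas(image))   # corresponds to step S1202
    registry[name] = blocks                 # register and save the template
    return blocks


def stub_partition(image):
    # Stand-in for a real area partitioning routine; values are made up.
    return [
        {"x": 40, "y": 30, "width": 500, "height": 60, "attribute": "title"},
        {"x": 40, "y": 110, "width": 240, "height": 200, "attribute": "text"},
        {"x": 300, "y": 110, "width": 240, "height": 200, "attribute": "figure"},
    ]


registry = {}
saved = save_template_automatically(
    registry, "Document A Template", image=None, partition_areas=stub_partition
)
```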

[Third Embodiment] 

[0032] In the second embodiment, all areas extracted 
by area partitioning processing in template save 
processing are saved as a template. However, it can be 
so arranged that only a desired area among the 
extracted areas is selected and saved as a template. In 
the third embodiment, specific blocks among blocks 
extracted in area partitioning processing are designated 
by the mouse 103 and only the area partitioning data of
these blocks are saved as a template. 
[0033] Fig. 14 is a flowchart illustrating the procedure 
of processing for saving a template according to the 
third embodiment. The flow of processing of the third 
embodiment will now be described in accordance with 
the flowchart of Fig. 14. 

[0034] The document A (see Fig. 4) having the format 
desired to be saved is read in and converted to binary 
image data by the scanner 101 at step S1401. The input
image obtained is then subjected to area partitioning
processing at step S1402 to extract various areas
(blocks) such as a text area, figure area, table area and 
title area. The result of this extraction is as described 
above in connection with the second embodiment (Fig. 
13). By way of example, area partitioning processing 
indicated at step S303 in Fig. 3 can be used as the area 
partitioning processing at step S1402. 
[0035] This is followed by step S1403, at which spe- 
cific blocks are selected from the extracted blocks using 
the mouse 103. Fig. 15 is a diagram showing a state in 
which a block 3 and a block 5 have been selected. In the 
case of the example shown in Fig. 15, the selected
blocks are indicated by hatching so as to be distinguished
from other blocks.

[0036] Next, at step S1404, only the area partitioning 
data (attributes and position information, etc.) of the 
blocks selected at step S1403 is saved as a template. In
this example, only the area partitioning data of blocks 3 
and 5 is saved as the document A template in the man- 
ner shown in Fig. 16. 

[0037] A case where document B shown in Fig. 9 is 
processed using the document A template saved in the
manner explained above will now be described. Fig. 17
is a diagram useful in describing results of comparing
the template and each block of a document in a case
where desired areas are extracted from the document B
using the template composed solely of the blocks
selected in Fig. 15. If the area extraction processing of
Fig. 3 is executed in the case of this example, only the
area partitioning data of blocks C and E is output. The
blocks C and E among the blocks (indicated by the broken
lines) of document B are judged to be areas that at
least partially overlap the blocks (indicated by the 
dashed lines) recorded in the template and that have 
the same attributes as those of the template blocks. 
[0038] Thus, in accordance with the third embodiment,
as described above, desired areas can be
selected from automatically extracted area data and the 
selected areas can be saved as a template. 

[Fourth Embodiment] 


[0039] In the third embodiment, areas to be saved as 
a template are designated by the operator. However, it 
goes without saying that areas that are not to be saved 
as a template may be designated by the operator. In the 
fourth embodiment, a desired area among blocks that
have been extracted by area partitioning processing is 
designated by a mouse or the like and area data of 
blocks other than the designated block are saved as a 
template. 

[0040] Fig. 18 is a flowchart for describing the procedure
of processing for saving a template according to
the fourth embodiment. The document A (see Fig. 4) 
having the format desired to be saved is read in and 
converted to binary image data by the scanner 101 at
step S1801. The input image obtained is then subjected
to area partitioning processing at step S1802 to extract
various areas (blocks) such as a text area, figure area,
table area and title area. This extraction operation provides
results already described above in connection
with the second embodiment (Fig. 13).

[0041] Next, at step S1803, desired blocks are 
selected from among the extracted blocks by using the 
mouse 103. For example, in Fig. 15 described above, 
blocks 3 and 5 are illustrated as being in the selected
state. As shown in Fig. 15, the selected blocks are indicated
by hatching so as to be distinguished from other
blocks. Whereas these selected blocks were registered
as a template in the third embodiment, the area data of
these selected blocks is deleted in accordance with the
fourth embodiment. Though the interior of a rectangular
area of a selected and deleted block is hatched in the
description given above, the method of representing a
selected area is not limited to this expedient. For example,
an arrangement may be adopted in which the frame
border indicating the rectangle of the block is erased
along with the deletion of the area data.
[0042] Next, at step S1804, only the area partitioning
data (attributes and position information, etc.) of the
blocks other than those selected at step S1803 is saved
as a template. In this example, only the area partitioning
data of blocks 1, 2 and 4 is saved as the document A
template as the result of the selections shown in Fig. 15.
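The difference between the third and fourth embodiments comes down to whether the designated blocks or the remaining blocks are saved. A minimal sketch follows, in which `blocks_for_template` is a hypothetical name and the attributes assigned to blocks 1 - 5 are made up for illustration.

```python
def blocks_for_template(blocks, designated, keep_designated=True):
    """Choose which numbered blocks are saved as the template.

    keep_designated=True  saves the designated blocks (third embodiment).
    keep_designated=False saves all blocks except them (fourth embodiment).
    """
    designated = set(designated)
    return {
        number: block
        for number, block in blocks.items()
        if (number in designated) == keep_designated
    }


# Blocks 1 - 5 from area partitioning; attributes here are illustrative only.
blocks = {1: "text", 2: "text", 3: "table", 4: "text", 5: "figure"}

third = blocks_for_template(blocks, designated=[3, 5])
fourth = blocks_for_template(blocks, designated=[3, 5], keep_designated=False)
```

With blocks 3 and 5 designated, the first call keeps blocks 3 and 5 and the second keeps blocks 1, 2 and 4, matching the two examples in the description.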
[0043] The extraction of areas from an input document
image using a template thus obtained is as already 
described above with reference to the flowchart of Fig. 3 
in the first embodiment. 

[Fifth Embodiment]

[0044] In each of the foregoing embodiments, docu- 
ments to be processed are placed in the scanner 101 
and read in one by one. However, automatic extraction 
by template in a case where a large number of docu- 25 
ments have been entered by the scanner 101 using an 
ADF (Automatic Document Feeder) also is possible. 
Fig. 19 is a flowchart for describing the procedure of 
area extraction processing according to a fifth embodi- 
ment of the present invention. The flow of processing 30 
will now be descrtoed in accordance with the flowchart 
of Fig. 19. It should be noted that the processing of 
steps S1901, SI 903 - Si 907 is the same as that of 
steps S301 - S306 in the first embodiment. 
[0045] The method of template registration may
employ the technique of any of the first through fourth 
embodiments described above. 
[0046] A desired template to be used to extract 
desired areas is selected at step S1901. Next, it is
determined at step S1902 whether there is a document
to be input, i.e., whether there is a document in the ADF.
If the decision rendered is "YES", control proceeds to
step S1903, where the document is read and
converted to binary image data by the scanner 101.
If the decision at step S1902 is "NO", however, then this
processing is terminated. 

[0047] Next, the input image obtained is subjected to 
area partitioning processing at step S1904 to extract 
blocks. For example, if document B shown in Fig. 9 was 
read in at step S1903, then results of the kind shown in
Fig. 10 are obtained by the area partitioning processing 
executed at step S1904. 

[0048] Next, at step S1905, area data of blocks that
have been extracted by the area partitioning processing
of step S1904 are compared with area data of blocks
that have been saved in the template selected at step
S1901. This is followed by step S1906, namely by the
extraction of a block whose area at least partially overlaps
the area of a block in the template and has the
same attribute as that of the area it overlaps. Here the 
extracted block is construed to be a block identical with 
the desired block defined in the template and the area 
partitioning data of this block is output (step S1907). 
[0049] For example, if the document A template 
obtained based upon document A shown in Fig. 5 is 
selected and the document B shown in Fig. 9 is read in 
by the scanner 101 and processed, then areas overlap 
as shown in Fig. 11. (Blocks that have been recorded as
the document A template are indicated by dashed lines, 
and blocks that have been extracted from the image of 
document B are indicated by broken lines.) Since the 
attributes of the blocks whose areas overlap each other 
are the same (see Figs. 5 and 10), the data of the areas 
of all blocks in the document B obtained in Fig. 10 is output.
Control then returns to step S1902 and processing
continues. 
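The loop of Fig. 19 can be sketched as follows. Scanning, area partitioning and the block comparison are supplied as callables because their details are given elsewhere or left to known techniques; the names `process_batch`, `scan`, `partition_areas` and `matches` are hypothetical, and the demonstration uses trivial stand-ins in which a block matches a template entry when the attributes are equal.

```python
from collections import deque


def process_batch(feeder, template, scan, partition_areas, matches):
    """Process every document in the feeder with one selected template."""
    outputs = []
    while feeder:                                # S1902: document left in the ADF?
        page = feeder.popleft()
        image = scan(page)                       # S1903: read and binarize
        blocks = partition_areas(image)          # S1904: extract blocks
        extracted = [                            # S1905 - S1906: compare and extract
            b for b in blocks if any(matches(b, t) for t in template)
        ]
        outputs.append(extracted)                # S1907: output area data
    return outputs                               # feeder empty: processing ends


# Stand-in data: each page maps to (attribute, block id) pairs.
pages = {"B1": [("text", 0), ("figure", 1)], "B2": [("table", 2)]}
template = ["text", "table"]                     # selected at S1901

results = process_batch(
    deque(["B1", "B2"]),
    template,
    scan=lambda page: page,
    partition_areas=lambda image: pages[image],
    matches=lambda block, entry: block[0] == entry,
)
```

Selecting the template once, outside the loop, reflects the flow in which step S1901 is executed a single time and control returns to step S1902 after each document.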

[0050] Thus, in accordance with each of the embodi- 
ments as described above, the following advantages 
are obtained when desired areas are extracted from a 
document image by a template: (1) Operator interven- 
tion is reduced; (2) the accuracy of desired area extrac- 
tion is improved; (3) large quantities of documents can 
be processed automatically; (4) operability is enhanced; 
and (5) overall processing time is shortened. 
[0051] Though the block areas are set as rectangular 
areas in each of the foregoing embodiments, the areas 
may have any shape, such as circular or elliptical, as 
long as they are closed areas. 

[0052] The present invention can be applied to a sys- 
tem constituted by a plurality of devices (e.g., a host 
computer, interface, reader, printer, etc.) or to an appa- 
ratus comprising a single device (e.g., a copier or fac- 
simile machine, etc.). 

[0053] Furthermore, it goes without saying that the 
invention is applicable also to a case where the object of 
the invention is attained by supplying a storage medium 
storing the program codes of the software for perform- 
ing the functions of the foregoing embodiments to a sys- 
tem or an apparatus, reading the program codes with a 
computer (e.g., a CPU or MPU) of the system or appa- 
ratus from the storage medium, and then executing the 
program codes. 

[0054] In this case, the program codes read from the 
storage medium implement the novel functions of the 
invention, and the storage medium storing the program 
codes constitutes the invention. 

[0055] Further, a storage medium such as a floppy
disk, hard disk, optical disk, magneto-optical disk, CD-ROM,
CD-R, magnetic tape, non-volatile type memory
card or ROM can be used to provide the program codes. 
[0056] Furthermore, besides the case where the 
aforesaid functions according to the embodiments are 
implemented by executing the program codes read by a 
computer, it goes without saying that the present inven- 
tion covers a case where an operating system or the like 
running on the computer performs a part of or the entire 
process in accordance with the designation of program
codes and implements the functions according to the 
embodiments. 

[0057] It goes without saying that the present invention
further covers a case where, after the program
codes read from the storage medium are written in a 
function extension board inserted into the computer or 
in a memory provided in a function extension unit con- 
nected to the computer, a CPU or the like contained in 
the function extension board or function extension unit
performs a part of or the entire process in accordance 
with the designation of program codes and implements 
the function of the above embodiment. 
[0058] Thus, in accordance with the present invention 
as described above, it is possible to reliably extract
desired areas from an entered document image while 
employing a template. 

[0059] As many apparently widely different embodi- 
ments of the present invention can be made without 
departing from the spirit and scope thereof, it is to be
understood that the invention is not limited to the spe- 
cific embodiments thereof except as defined in the 
appended claims. 

Claims

1. An image processing apparatus comprising:

holding means for holding position, size and
attribute as template information in regard to
one or a plurality of areas in an image;
image input means for inputting a document
image;
first extraction means for extracting block areas
from the document image input by said image
input means and evaluating attributes of the
extracted block areas; and
second extraction means for extracting, from
block areas that have been extracted by said
first extraction means, a block area that at least
partially overlaps an area indicated by the template
information and whose attribute agrees
with the attribute included in the template information.

2. The apparatus according to claim 1, wherein said
holding means holds template information of a plurality
of types;

said apparatus further comprising selection
means for selecting desired template information
from the template information of the plurality
of types held by said holding means;
said first extraction means extracting areas
from the document image using template information
that has been selected by said selection
means.

3. The apparatus according to claim 1 , wherein said 
holding means includes: 

setting means for setting areas and attributes 
with respect to the input image; and 
registration means for registering, as template 
information, position, size and attribute of each 
area set by said setting means. 

4. The apparatus according to claim 3, wherein said 
setting means includes: 

display means for displaying the input image;
and
designation means for allowing a user to desig- 
nate a desired area and attribute on the image 
displayed by said display means; 
position and size of an area designated using 
said designation means and the designated 
attribute serving as template information. 

5. The apparatus according to claim 3, wherein said 
setting means sets an area and attribute, which are 
to serve as template information, by extracting a 
block area and its attributes from the image that has 
been input by said first extraction means. 

6. The apparatus according to claim 1, wherein said
holding means includes: 

area acquisition means for acquiring a block
area and its attribute from an image that has 
been input using said first extraction means; 
area selection means for selecting a desired 
block area from a rectangular area obtained by 
said area acquisition means; and 
registration means for registering as template 
information the block area and its attribute 
selected by said area selection means. 

7. The apparatus according to claim 6, wherein said 
registration means registers as template informa- 
tion a block area, as well as its attribute, other than 
a block area that has been selected by said area 
selection means. 

8. The apparatus according to claim 1, wherein said
second extraction means extracts, from block areas 
that have been extracted by said first extraction 
means, a block area that overlaps, in excess of a 
predetermined ratio, an area indicated by the tem- 
plate information and whose attribute agrees with 
the attribute included in the template information. 

9. An image processing method comprising: 

a holding step of holding position, size and 
attribute as template information in regard to 
one or a plurality of areas in an image;
an image input step of inputting a document 
image; 

a first extraction step of extracting block areas 
from the document image input at said image
input step and evaluating attributes of the 
extracted block areas; and 
a second extraction step of extracting, from 
block areas that have been extracted at said 
first extraction step, a block area that at least
partially overlaps an area indicated by the tem- 
plate information and whose attribute agrees 
with the attribute included in the template infor- 
mation. 

10. The method according to claim 9, wherein said 
holding step holds template information of a plural- 
ity of types; 

said method further comprising a selection
step of selecting desired template information 
from the template information of the plurality of 
types held at said holding step; 
said first extraction step extracting areas from 
the document image using template information
that has been selected at said selection
step. 

11. The method according to claim 9, wherein said 
holding step includes:

a setting step of setting areas and attributes 
with respect to the input image; and 
a registration step of registering, as template 
information, position, size and attribute of each
area set at said setting step. 

12. The method according to claim 11, wherein said 
setting step includes: 

a display step of displaying the input image; 
and 

a designation step of allowing a user to desig- 
nate a desired area and attribute on the image 
displayed at said display step;
position and size of an area designated at said 
designation step and the designated attribute 
serving as template information. 

13. The method according to claim 11, wherein said
setting step sets an area and attribute, which are to 
serve as template information, by extracting a block 
area and its attributes from the image that has been 
input at said first extraction step. 

14. The method according to claim 9, wherein said 
holding step includes: 



an area acquisition step of acquiring a block 
area and its attribute from an image that has 
been input using said first extraction step; 
an area selection step of selecting a desired 
block area from a rectangular area obtained at 
said area acquisition step; and 
a registration step of registering as template 
information the block area and its attribute 
selected at said area selection step. 

15. The method according to claim 14, wherein said 
registration step registers as template information a 
block area, as well as its attribute, other than a 
block area that has been selected at said area 
selection step. 

16. The method according to claim 9, wherein said second
extraction step extracts, from block areas that
have been extracted at said first extraction step, a 
block area that overlaps, in excess of a predeter- 
mined ratio, an area indicated by the template infor- 
mation and whose attribute agrees with the 
attribute included in the template information. 

17. A storage medium storing a control program for 
causing a computer to extract areas from an input 
image, said control program comprising: 

code of a holding step of holding position, size 
and attribute as template information in regard 
to one or a plurality of areas in an image; 
code of an image input step of inputting a doc- 
ument image; 

code of a first extraction step of extracting block 
areas from the document image input at said 
image input step and evaluating attributes of 
the extracted block areas; and 
code of a second extraction step of extracting, 
from block areas that have been extracted at 
said first extraction step, a block area that at 
least partially overlaps an area indicated by the 
template information and whose attribute 
agrees with the attribute included in the tem- 
plate information. 

18. Image processing apparatus comprising means for 
storing template data relating to size, position and 
attributes of different areas of an image plane, 
means for extracting areas from an image and for 
comparing the extracted areas with the template 
data so as to extract as a block area any area of the 
image which overlaps an area of the template data 
and which has the same attribute as the overlapped 
template area. 